Multi-Label Classification: Inconsistency and Class Balanced K-Nearest Neighbor
Abstract
Many existing approaches employ the one-vs-rest method to decompose a multi-label classification problem into a set of two-class classification problems, one for each class. This method is valid in traditional single-label classification; in multi-label classification, however, it incurs training inconsistency, because a data point can belong to more than one class. To address this problem, we extend the classical K-Nearest Neighbor classifier and propose a novel Class Balanced K-Nearest Neighbor approach for multi-label classification that emphasizes balanced usage of data from all classes. We also propose a Class Balanced Linear Discriminant Analysis approach to handle high-dimensional multi-label input data. Promising experimental results on three widely used multi-label data sets demonstrate the effectiveness of our approach.
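The one-vs-rest decomposition described in the abstract can be illustrated with a minimal sketch. The toy label sets, the class names, and the helper functions `one_vs_rest_targets` and `shared_positives` are all hypothetical and not taken from the paper; the reading of "training inconsistency" shown here (a multi-label point is a positive example for one class while also belonging to a class that the same binary problem treats as "rest") is an interpretation of the abstract, not the authors' formal definition.

```python
from typing import Dict, List, Set


def one_vs_rest_targets(
    label_sets: List[Set[str]], classes: List[str]
) -> Dict[str, List[int]]:
    """Build one binary target vector per class (1 = in the class, 0 = 'rest')."""
    return {c: [1 if c in labels else 0 for labels in label_sets] for c in classes}


def shared_positives(targets: Dict[str, List[int]]) -> List[int]:
    """Return indices of points that are positive in more than one binary problem."""
    n = len(next(iter(targets.values())))
    return [i for i in range(n) if sum(t[i] for t in targets.values()) > 1]


# Hypothetical toy data: three points; the second one carries two labels.
label_sets = [{"beach"}, {"beach", "sunset"}, {"sunset"}]
classes = ["beach", "sunset"]

targets = one_vs_rest_targets(label_sets, classes)
overlap = shared_positives(targets)

print(targets)   # binary targets for each one-vs-rest problem
print(overlap)   # points that cannot be cleanly assigned to 'class' or 'rest'
```

In a single-label data set `overlap` would always be empty, which is why the decomposition is unproblematic there. With multi-label data, the second point is a positive example in the "beach" problem even though it also belongs to "sunset", a class whose other members serve as negatives in that same problem.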
Similar resources
Multi-label Classification: Inconsistency, Ambiguity and Class Balanced KNN Classification
Many existing studies employ the one-vs-others approach to decompose a multi-label classification problem into a set of 2-class classification problems, one for each class. This approach is valid in traditional single-label classification. However, it incurs training inconsistency in multi-label classification, because a multi-label data point could belong to more than one class. In this work, w...
ML-KNN: A lazy learning approach to multi-label learning
Multi-label learning originated from the investigation of the text categorization problem, where each document may belong to several predefined topics simultaneously. In multi-label learning, the training set is composed of instances, each associated with a set of labels, and the task is to predict the label sets of unseen instances by analyzing training instances with known label sets. In this...
A Coupled k-Nearest Neighbor Algorithm for Multi-label Classification
ML-kNN is a well-known algorithm for multi-label classification. Although effective in some cases, ML-kNN has a drawback: it is a binary relevance classifier that considers only one label at a time. In this paper, we present a new method for multi-label classification, which is based on lazy learning approaches to classify an unseen instance on the basis of its k nearest ...
Learning Label Embeddings for Nearest-Neighbor Multi-class Classification with an Application to Speech Recognition
We consider the problem of using nearest neighbor methods to provide a conditional probability estimate, P(y|a), when the number of labels y is large and the labels share some underlying structure. We propose a method for learning label embeddings (similar to error-correcting output codes (ECOCs)) to model the similarity between labels within a nearest neighbor framework. The learned ECOCs and...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a wide range of library resources. However, classifying documents drawn from a large amount of data remains a problem, and finding specific documents demands time and effort. Grouping similar documents into specific classes can reduce the time needed to search for the required data, particularly text documents. This is further facilitated by using Artificial...